Artificial promoter libraries for microorganisms



This invention concerns artificial promoter libraries and
a method of constructing an artificial promoter library
for a selected microorganism or group of microorganisms.
The invention also concerns a method of optimizing the
expression of a recombinant heterologous gene in a selected
microorganism by use of such an artificial promoter library
for that microorganism. In connection with this invention
the term "microorganism" shall be taken broadly to include
prokaryotic organisms such as bacteria as well as eukaryotic
microorganisms such as yeasts, other fungi and cell lines of
higher organisms.

BACKGROUND OF THE INVENTION

Metabolic engineering of microorganisms is still in its
"infancy" with respect to industrial applications, despite
the fact that genetic engineering has now been feasible for
more than a decade. To a large extent, this may be due to
the disappointing outcome of many of the attempts so far to
improve strain performance. There are at least two reasons
for the negative outcome of the attempts to increase
metabolic fluxes:

One is that the genetic engineer tends to overlook the
subtlety of control and regulation of cellular metabolism.
The expression of enzymes that are expected to be
"rate limiting" is increased 10 to 100 fold, e.g. by
placing the gene on a high copy number plasmid. Or, a
branching flux in a pathway is eliminated by deleting a
gene. Quite often, this will have secondary effects on
the metabolism, for instance by lowering metabolite
concentrations that are essential to other parts of the
cellular metabolism (e.g. processes that are essential to
the growth of the organism), and the net result may be
that the overall performance of the cell with respect to
the desired product is decreased. Instead, it is necessary
to "tune" the expression of the relevant gene around
the normal expression level and determine the optimal
expression level, as the level that maximizes the flux.

The second reason for the negative outcome lies in the
"rate limiting" concept itself: both metabolic control
theory (Kacser and Burns, 1973) and experimental
determinations of control by individual steps in a pathway
(Schaaff et al., 1989; Jensen et al., 1993) have shown
that reaction steps which were expected to be "rate
limiting" with respect to a particular flux, turned out to
have no or very little control over the flux. Instead,
the control and regulation of the cellular metabolism
turned out to be distributed over several enzymes in a
pathway, and it may be necessary to enhance the expression
of several enzymes in order to obtain a higher flux.

According to metabolic control theory, the total flux
control exerted by all the enzymes in a pathway should
always sum up to 1. Therefore, after one enzyme
concentration has been optimized, the flux control will have
shifted to another enzyme(s), and it may then be useful
to perform additional rounds of enzyme optimization in
order to increase the flux further.
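
The summation property referred to above can be written
compactly. The following is a sketch in the usual notation of
metabolic control analysis; the symbols J (steady-state flux),
e_i (concentration of enzyme i), n (number of enzymes in the
pathway) and C^J_i (flux control coefficient of enzyme i) are
introduced here for illustration and do not appear in the text.

```latex
C^{J}_{i} = \frac{\partial \ln J}{\partial \ln e_{i}},
\qquad
\sum_{i=1}^{n} C^{J}_{i} = 1
```

Because the coefficients sum to 1, lowering the control exerted
by one enzyme (by raising its concentration) necessarily moves
control to the remaining steps, which is why repeated rounds of
optimization may be useful.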

In summary, flux optimization requires 1) fine-tuning of
enzyme concentration rather than many-fold overexpression
and often 2) optimization of the level of several enzymes
in a pathway rather than looking for the "rate limiting"
step.

There are now many systems available that allow one to
increase the gene expression more than 1000 fold and/or
to turn on gene expression at a particular time point
during a fermentation process (e.g. using temperature
inducible systems). Tuning gene expression in the fermenter
to, say, 150% or 70% of the normal expression level is more
difficult. In principle, one could use a lac-type promoter
in front of the gene of interest, and then add a certain
amount of an inducer of the lac system, for instance IPTG
(isopropyl-β-D-thiogalactoside), or use a temperature
sensitive system at the correct temperature. These
possibilities are often not practical for large scale
industrial applications. The alternative is to use a
promoter that has exactly the right strength. However, such
promoters are seldom available, and furthermore one needs a
range of promoter activities in order to optimize the
expression of the gene in the first place, see below.

SUMMARY OF THE INVENTION 

The present invention provides an artificial promoter
library for a selected microorganism or group of
microorganisms, comprising a mixture of double stranded DNA
fragments the sense strands of which comprise at least
one consensus sequence or part(s) thereof of efficient
promoters from said microorganism or group of
microorganisms and surrounding or intermediate nucleotide
sequences (spacers) of variable length in which the
nucleotides are selected randomly among the nucleobases
A, T, C and G.

The sense strands of the double stranded DNA fragments
may also include a regulatory DNA sequence imparting a
specific regulatory feature to the promoters of the
library. Such a specific regulatory feature is preferably
activation by a change in the growth conditions, such as a
change in the pH, osmolarity, temperature or growth
phase.

For cloning purposes the double stranded DNA fragments
usually have sequences comprising one or more recognition
sites for restriction endonucleases added to their ends;
most conveniently sequences specifying multiple recognition
sites for restriction endonucleases (multiple cloning
sites, MCS).

The selected microorganism or group of microorganisms may
be selected from prokaryotes and from eukaryotic
microorganisms such as yeasts, other fungi and cell lines of
higher organisms.

An interesting group of prokaryotes, i.a. in the dairy
industry, consists of lactic acid bacteria of the genus
Lactococcus (lactococci), in particular strains of the
species Lactococcus lactis.

In an artificial promoter library for lactococci said
consensus sequences should comprise the -35 signal (-35
to -30): TTGACA or the -10 signal (-12 to -7): TATAAT or
a part of one of them comprising at least 4 conserved
nucleotides.

More efficient promoters are usually obtained when said
consensus sequences comprise the -35 signal (-35 to -30):
TTGACA and the -10 signal (-12 to -7): TATAAT or parts of
both comprising together at least 6 conserved nucleotides;
and the most efficient promoters are obtained when
said consensus sequences further comprise intervening
conserved motifs, e.g. selected from the conserved motifs
-44 to -41: AGTT, -40 to -36: TATTC and +1 to +8:
GTACTGTT.

In such promoters the length of the spacer between the
-35 signal and the -10 signal should be 14-23 bp,
preferably 16-18 bp, and more preferably 17 bp.
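
The spacing rule above can be sketched in a few lines of Python.
This is only an illustration: the function names
find_promoter_cores and spacer_class are hypothetical, the
hexamers and length limits are the ones quoted in the text, and
the scan requires exact matches to both hexamers, which real
promoters often do not have.

```python
import re

# Consensus hexamers quoted in the text for lactococci.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def find_promoter_cores(seq, min_spacer=14, max_spacer=23):
    """Return (start, spacer_length) for each exact -35/-10 pair in seq.

    The lookahead lets overlapping candidates be reported; the lazy
    quantifier reports the shortest allowed spacer for each -35 start.
    """
    pattern = re.compile(
        rf"(?=({MINUS_35})([ACGT]{{{min_spacer},{max_spacer}}}?)({MINUS_10}))"
    )
    return [(m.start(), len(m.group(2))) for m in pattern.finditer(seq.upper())]


def spacer_class(length):
    """Classify a spacer length against the ranges given in the text."""
    if length == 17:
        return "optimal"
    if 16 <= length <= 18:
        return "preferred"
    if 14 <= length <= 23:
        return "allowed"
    return "outside"


if __name__ == "__main__":
    # Made-up test sequence with a 17 bp spacer.
    example = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
    for start, spacer in find_promoter_cores(example):
        print(start, spacer, spacer_class(spacer))
```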

An interesting eukaryotic microorganism is the yeast
species Saccharomyces cerevisiae, normal baker's yeast.
In Saccharomyces cerevisiae said consensus sequences
should normally comprise a transcription initiation
signal (TI box) functioning in Saccharomyces cerevisiae. In
most cases they should also comprise a TATA box and/or at
least one upstream activation sequence (UAS) upstream of
the TI box.

In a specific embodiment of an artificial promoter
library according to the invention for Saccharomyces
cerevisiae said consensus sequences comprise the TI box:
CTCTTAAGTGCAAGTGACTGCGA, which also functions as the
binding site for the arginine repressor, argR, the TATA
box: TATAAA, and the UASGCN4p: TGACTCA.

The present invention also provides a method of
constructing an artificial promoter library for a selected
microorganism or group of microorganisms, which comprises
selecting at least one consensus sequence of efficient
promoters from said microorganism or group of
microorganisms; synthesizing a mixture of single stranded DNA
sequences comprising said consensus sequence(s) or part(s)
thereof and surrounding or intermediate nucleotide
sequences (spacers) of variable length in which the
nucleotides are selected randomly among the nucleobases
A, T, C and G; and converting the single stranded DNA
sequences into double stranded DNA fragments by second
strand synthesis.

In order to obtain an artificial promoter library which
is susceptible to regulation, the single stranded DNA
sequences which are synthesized may include a regulatory
DNA sequence imparting a specific regulatory feature to
the promoters of the library. Such a specific regulatory
feature is preferably activation by a change in the
growth conditions, such as a change in the pH, osmolarity,
temperature or growth phase.

Also, in order to obtain an artificial promoter library
suitable for cloning, sequences specifying one or more
recognition sites for restriction endonucleases may be
added to the ends of the single stranded DNA sequences in
the synthesis, or linkers comprising such restriction
sites may be ligated to the ends of the double stranded
DNA fragments. Most conveniently such sequences specify
multiple recognition sites for restriction endonucleases
(multiple cloning sites, MCS).


The selected microorganisms for which artificial promoter
libraries can be prepared by the method according to the
invention and the various degenerated sequences to be
chosen for the promoter libraries of specific
microorganisms are the same as discussed above for the
artificial promoter libraries per se.

With respect to possible uses of the artificial promoter
libraries described above, the invention further provides
a method of optimizing the expression of a recombinant
heterologous gene in a selected microorganism, which
comprises transforming or transfecting the microorganism
with a set of vectors each including said heterologous
gene under the control of at least one member of an
artificial promoter library according to any one of claims
1-24 or constructed by the method according to any one of
claims 25-48, said set of vectors covering a wide range
of promoter activities in relatively small steps, growing
the selected clones and screening them to find the one
showing maximum metabolic flux of the product encoded by
said heterologous gene.

DETAILED DESCRIPTION OF THE INVENTION 

The features that make a promoter function efficiently in
a particular microorganism are the consensus sequences
(e.g. the -10 region, -35 region, etc.) and the optimal
distances between these. According to the literature, by
including these elements, the resulting promoters would
tend to become strong, and the sequence of the spacers
between the consensus sequences should be less important
for the strength of the promoters. In principle,
modulation of the strength of these promoters could then be
achieved by base pair changes in the consensus sequences.
The promoter libraries that we wish to construct should
cover a range of promoter activity from, say, less than 1
up to several thousand relative units, in small steps,
for example an increase in activity by 50-100% per step.
But the impact of changes in the consensus sequences on
the promoter strength will tend to be large because
relatively few base pairs determine the strength, and
therefore it will not be feasible to achieve these small
steps of strength modulation.
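
A rough sense of how many distinct promoters such a library must
contain follows from the step size. The sketch below assumes that
a 50-100% increase per step means each promoter is 1.5 to 2 times
stronger than the previous one; the function name
promoters_needed and the example range of 1 to 2000 relative
units are illustrative choices, not values prescribed by the
text.

```python
import math


def promoters_needed(lowest, highest, step_factor):
    """Promoters required to span lowest..highest activity.

    Each promoter is assumed to be step_factor times stronger than the
    previous one, so the ladder is geometric. The result is the number
    of steps plus one, since n steps connect n + 1 promoters.
    """
    steps = math.ceil(math.log(highest / lowest) / math.log(step_factor))
    return steps + 1


if __name__ == "__main__":
    # 100% increase per step (factor 2.0) and 50% per step (factor 1.5).
    print(promoters_needed(1, 2000, 2.0))  # 12
    print(promoters_needed(1, 2000, 1.5))  # 20
```

Under these assumptions a dozen to a few dozen well-spaced
promoters would be enough to cover the whole range.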

We have used a different approach to obtain sets of
promoters, the strength of which covers (in small steps) the
range of gene expressions that could become of interest.
First, degenerated oligonucleotides, approximately 100
nucleotides long, are synthesized for the microorganism
for which one wishes to construct a promoter library. The
sequences of the oligonucleotides are written by combining
as much of the available knowledge from the literature
as possible on the features that make a promoter
function efficiently in that particular microorganism.
Secondly, the single stranded oligonucleotides are
converted into double stranded DNA fragments and cloned into
a promoter probing vector. The promoter-containing clones
are identified by their ability to give colonies with
various extents of colour on indicator plates. This
should in principle give us only very strong promoters,
but we discovered that by allowing the spacer sequences
between the consensus sequences to vary in a random
manner, the strength of the resulting promoters is
modulated. In fact, using this method we obtained promoter
libraries spanning the entire range of promoter activities
that can become of interest, in small steps of activity
change.

Optimization of gene expression could then be achieved as
follows: 1) From the promoter library one chooses
promoters that have e.g. 25%, 50%, 100%, 200% and 400% of the
strength of the wild-type promoter. 2) Then, these
promoters are cloned in place of the wild-type promoter
upstream of the gene of interest. 3) The magnitude of the
variable to be optimized (e.g. the flux through a pathway)
that is obtained with each of these five constructs
is then measured and the optimal construct is used
directly as the production organism. It may be necessary to
fine-tune the expression further or to expand the range
of promoter activities. A direct advantage of this system
over the inducible systems described above is that once
the optimal promoter activity has been determined, the
strain is in principle ready for use in the fermentation
process.
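
The three steps above can be sketched as follows. Everything in
the example is hypothetical: the clone names, the activity
values, the flux values and the function names pick_promoters and
best_construct are invented for illustration, and choosing the
closest clone on a logarithmic scale is one reasonable choice
rather than a procedure stated in the text.

```python
import math


def pick_promoters(library, wild_type_activity,
                   targets=(0.25, 0.5, 1.0, 2.0, 4.0)):
    """Step 1: for each target fraction of wild-type strength, pick the
    clone whose measured activity is closest on a log scale."""
    chosen = {}
    for fraction in targets:
        goal = fraction * wild_type_activity
        chosen[fraction] = min(
            library, key=lambda name: abs(math.log(library[name] / goal))
        )
    return chosen


def best_construct(measured_flux):
    """Step 3: return the construct that gave the highest measured flux."""
    return max(measured_flux, key=measured_flux.get)


if __name__ == "__main__":
    # Hypothetical library: clone name -> promoter activity (relative units).
    library = {"P1": 5, "P2": 12, "P3": 24, "P4": 55,
               "P5": 98, "P6": 210, "P7": 390, "P8": 800}
    picks = pick_promoters(library, wild_type_activity=100)
    print(picks)

    # Step 2 (cloning) happens at the bench; step 3 uses measured fluxes.
    flux = {"P3": 0.8, "P4": 1.1, "P5": 1.0, "P6": 1.3, "P7": 0.9}
    print(best_construct(flux))
```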

Often it is desirable to activate gene expression to a
certain extent and at a certain stage of a fermentation,
e.g. because the gene product that is expressed inhibits
the growth of the cells. It is then useful to combine the
above technique for obtaining promoters of different
strength with some regulatory mechanism, e.g. so that the
promoter will be activated by a change in the pH,
temperature or growth phase.

Thus, in a specific embodiment of the invention, as
illustrated in Example 2, the above approach is used in
combination with specific regulatory DNA sequences to
generate a library of promoters that have a specific
regulatory feature in common.

In addition to prokaryotes, eukaryotes (yeast and other
fungi as well as mammalian cell lines) are important
microorganisms for production of a range of organic
compounds and various proteins. It is therefore of interest
to develop the above approach for modulating gene
expression in these organisms as well. Thus, in another
specific embodiment of the invention, as illustrated in
Example 3, the above technique for obtaining promoters
of different strength is elaborated for the baker's yeast,
Saccharomyces cerevisiae.

EXAMPLE 1 

Design of a degenerated oligonucleotide for a L. lactis
promoter library.

According to the literature (see review in de Vos &
Simons, 1994), strong promoters in L. lactis tend to have
the following nucleotide sequences in common (numbers
refer to the position relative to the transcription
initiation site, which is given number +1): -12 to -7: TATAAT;
-15 to -14: TG; -35 to -30: TTGACA. The spacing between
-10 and -35 seems to be 17 nucleotides. However, closer
comparison of the promoter sequences that have been
published for L. lactis reveals that in a number of
positions besides the ones mentioned above, nucleotides are
more or less well conserved. Some of these positions are:
-1: A; -3: A or T (=W); -6: A; -13: A or G (=R); -40 to
-36: TATTC. In addition, Nilsson and Johansen (1994, BBA)
pointed out two motifs, +1 to +8: GTACTGTT, and -44 to
-41: AGTT, that appear to be well conserved between
relatively strong promoters (promoters for transfer RNA and
ribosomal RNA operons) from L. lactis. These motifs may
confer both strength and growth rate dependent expression
from the promoter.

When these additional motifs are included, one arrives
at the following 53 nucleobase degenerated sequence for
an efficient promoter in L. lactis. Out of these 53
nucleobases, 34 bases are conserved, two are semi-conserved
(R and W) and 17 are allowed to vary randomly between the
four nucleobases.



5' AGTT TATTC TTGACA NNNNNNNNNNNNNN TG R TATAAT ANNWNA GTACTGTT 3'
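
To make the idea of a degenerated sequence concrete, the sketch
below draws individual promoter variants from the pattern shown
above and computes how many distinct variants the pattern allows.
The names IUPAC, sample_promoter, library_complexity and matches
are hypothetical helpers, and uniform random choice at each
degenerate position is an assumption about the synthesis rather
than a measured property of the oligonucleotide mixture.

```python
import random

# IUPAC codes used in the degenerated sequence: R = A or G, W = A or T,
# N = any of the four nucleobases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "W": "AT", "N": "ACGT"}

# The degenerated promoter sequence from the text, written in blocks.
DEGENERATE = "".join([
    "AGTT", "TATTC", "TTGACA", "N" * 14, "TG", "R",
    "TATAAT", "ANNWNA", "GTACTGTT",
])


def sample_promoter(template=DEGENERATE, rng=random):
    """Draw one concrete sequence from the degenerated template."""
    return "".join(rng.choice(IUPAC[base]) for base in template)


def library_complexity(template=DEGENERATE):
    """Number of distinct sequences the template can give rise to."""
    total = 1
    for base in template:
        total *= len(IUPAC[base])
    return total


def matches(seq, template=DEGENERATE):
    """True if seq is one of the sequences described by the template."""
    return len(seq) == len(template) and all(
        s in IUPAC[t] for s, t in zip(seq, template)
    )


if __name__ == "__main__":
    print(sample_promoter(rng=random.Random(0)))
    print(library_complexity())  # 4**17 * 2 * 2
```

With 17 fully random positions and two semi-conserved ones, the
number of possible variants is far larger than the few dozen
clones screened, so each isolated clone is expected to carry a
different spacer sequence.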



In addition, this degenerated sequence is flanked by
sequences that specify multiple recognition sites for
restriction endonucleases (multiple cloning site, MCS), and
the actual oligonucleotide mixture to be synthesized has
the following degenerated sequence reported in SEQ ID No. 1:


[Restriction map and sequence of SEQ ID No. 1 (100 nucleotides,
numbered 1-60 and 61-100). The sequence begins 5' CGGGATCC... and
ends ...GAATTCGG 3'. Restriction sites marked on the map include
MboI, DpnI, AlwI, NlaIV, BstYI, BamHI, MseI, AflII, SspI, AluI,
PvuII, NspBII, Fnu4HI, PstI, ScaI, BbvI and EcoRI.]



A mixture of oligonucleotides according to this specifi- 
cation was synthesized by Hobolth DNA synthesis. 



This oligonucleotide mixture is single stranded initially
and must be converted into double stranded DNA fragments
for the purpose of cloning. This was done by synthesizing
a 10 base oligonucleotide, having a sequence
complementary to the 3' end of the promoter oligonucleotide.
This oligonucleotide was then used as a primer for the
second strand synthesis by the Klenow fragment of DNA
polymerase I in the presence of dATP, dCTP, dGTP and dTTP.
In this way the second DNA strand became exactly
complementary to the first DNA strand.
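
The primer design described above amounts to taking the reverse
complement of the last bases of the template. A minimal sketch,
in which the template string is a made-up example (it is not SEQ
ID No. 1) and the function names are hypothetical:

```python
# Complement table including the IUPAC codes that occur in degenerated oligos.
COMPLEMENT = str.maketrans("ACGTRYWSKMN", "TGCAYRWSMKN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (IUPAC aware)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def second_strand_primer(template, length=10):
    """Primer that anneals to the 3' end of a single stranded template."""
    return reverse_complement(template[-length:])


if __name__ == "__main__":
    # Hypothetical single stranded template, written 5' -> 3'.
    template = "ACGTTGACANNNNTATAATGCAGCTGAATTC"
    print(second_strand_primer(template))  # GAATTCAGCT
    # Extending the primer along the template gives the full second strand.
    print(reverse_complement(template))
```

Because the polymerase copies whichever base is present at each
random position, every template molecule yields its own exactly
matching second strand, which the N in the output stands for.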

The result is a mixture of 100 bp double stranded DNA
fragments containing multiple recognition sites for
restriction endonucleases in both the 3' and 5' end. These
DNA fragments were digested with restriction
endonucleases, in order to create DNA ends compatible with the
ends of the vector DNA fragment, pAK80 (Israelsen et al.,
1995). pAK80 is a "shuttle vector", meaning that it has
replication origins for propagation in both E. coli and
L. lactis. In this way, the cloning steps can be
conveniently performed in E. coli, while the subsequent
physiological experiments can be done in L. lactis.
Furthermore, pAK80 carries a promoterless beta-galactosidase
reporter gene system (lacLM) downstream of a multiple cloning
site. Thus, pAK80 does not express the lacLM genes
unless a promoter is inserted in the multiple cloning site.

Two cloning strategies were used for cloning the mixture
of double stranded DNA fragments into the cloning vector
pAK80:

1) The mixture was digested with BamHI and PstI and the
vector pAK80 with PstI and BglII (BglII is compatible
with BamHI).

2) The mixture was digested with SspI and HincII and the
vector pAK80 with SmaI (all three enzymes produce blunt
end DNA fragments).
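
Why these enzyme pairs can be combined follows from the ends they
leave. The sketch below derives the overhang of a few of the
enzymes named above from their recognition sites and cut
positions, which are standard published values. The table
ENZYMES and the functions overhang and compatible are
hypothetical helpers, and the calculation is only valid for
palindromic sites cut symmetrically.

```python
# Recognition site (top strand, 5' -> 3') and cut position within the site.
ENZYMES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "BglII": ("AGATCT", 1),   # A^GATCT
    "PstI": ("CTGCAG", 5),    # CTGCA^G
    "SspI": ("AATATT", 3),    # AAT^ATT
    "SmaI": ("CCCGGG", 3),    # CCC^GGG
}


def overhang(name):
    """Return (kind, sequence) of the single stranded end left by an enzyme."""
    site, cut = ENZYMES[name]
    mirror = len(site) - cut  # cut position on the bottom strand
    if cut == mirror:
        return ("blunt", "")
    kind = "5'" if cut < mirror else "3'"
    return (kind, site[min(cut, mirror):max(cut, mirror)])


def compatible(first, second):
    """Two ends can be ligated if they are both blunt or share an overhang."""
    return overhang(first) == overhang(second)


if __name__ == "__main__":
    print(overhang("BamHI"), overhang("BglII"))  # both ("5'", 'GATC')
    print(compatible("BamHI", "BglII"))           # True
    print(compatible("SspI", "SmaI"))             # True, both blunt
```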

In both cases the vector DNA was further treated with
Calf Intestine Phosphatase (CIP) to prevent religation of
the cloning vector. Subsequently, promoter fragment and
vector were ligated overnight at 16 °C using T4 DNA
ligase and standard ligation conditions.

Ligation mixtures were transformed into E. coli K-12
Δlac, with selection for erythromycin resistance. Cells
of the E. coli K-12 strain, MT102, were made competent
using standard treatment with Ca++ ions (Maniatis et al.,
1982). Ligation mixtures were then transformed into these
cells using a standard transformation procedure (Maniatis
et al., 1982), and the resulting clones were screened for
beta-galactosidase activity that will produce blue
colonies on plates containing X-gal
(5-bromo-4-chloro-3-indolyl-β-D-galactoside). The
transformation mixture was plated on LB plates containing
200 µg/ml erythromycin, 1% glycerol and 100 µg/ml X-gal.
150 erythromycin resistant transformants were obtained, all
white initially, but after prolonged incubation (two weeks
at 4 °C), 46 of these colonies had become light blue. Thus,
using strategy 1) we found 17 blue colonies (CP30 to CP46),
and using strategy 2) we found 29 blue colonies (CP1 to
CP29).

Plasmid DNA was isolated from each of these clones (CP1
to CP46) and analysed by restriction enzyme mapping.
Nearly all plasmids contained promoter fragments inserted
into the MCS of pAK80, in the orientation that would
direct transcription of the otherwise promoterless lacLM
genes on this vector.

These 46 plasmid DNA preparations were then transformed
into L. lactis subspecies lactis MG1363 with selection
for erythromycin resistance. Cells of the L. lactis
subspecies lactis strain, MG1363 (Gasson, 1983) were made
competent by growing the cells overnight in SGM17 medium,
containing 2% glycine, as described by Holo and Nes
(1989). Plasmid DNA from each of the 46 clones described
above was then transformed into these cells using the
electroporation procedure (Holo and Nes, 1989). After
regeneration, the cells were plated on SR plates containing
2 µg/ml erythromycin. Subsequent screening for blue color
on X-gal plates revealed large differences in
beta-galactosidase activity between the 46 clones; some
clones gave dark blue colonies after 24 hours of incubation,
others only light blue colonies after more than 1 week of
incubation.

The beta-galactosidase activities of liquid cultures of
the 46 clones in MG1363 were then determined as described
by Miller (1972) and modified by Israelsen et al. (1995).
Cultures of the strain MG1363, each carrying one of the
46 plasmid derivatives of pAK80, were grown in M17 medium
supplemented with 1% glucose overnight at 30 °C. 25-100
µl of these cultures were then used in the subsequent
beta-galactosidase assay, except in the case of the
weakest promoter clones, where 2 ml of culture was used
(after 20 fold concentration by centrifugation). These
results are shown in the figure. Apparently, there are
very large differences in the efficiency of the cloned
promoter fragments, and together these clones cover a
range of promoter activities from 0.3 units of
beta-galactosidase activity to more than 2000 units. More
interesting, however, is that the range is covered by small
changes in activity. Therefore, these promoter fragments
will allow us not only to obtain a wide range of
expression of genes in L. lactis, but also to tune the
expression of genes in L. lactis in small steps for the
purpose of flux optimization.

DNA sequencing of the 46 clones described above revealed
that 64% of the inserted promoter fragments carried the
consensus sequences that were originally specified for the
oligonucleotide design (see above), whereas the sequence
of the remaining fragments deviated slightly from that
sequence. Most of the promoter fragments that gave low
activities in the beta-galactosidase assay (70 units of
beta-galactosidase or less) had either (A) no promoter
fragment inserted, (B) an error in one of the consensus
sequences, or (C) a different length of spacer between the
consensus sequences. All the clones that gave high
activities (more than 70 units) had the same sequence as
specified by the oligonucleotide. This result is in
accordance with the "dogma", i.e. that changes in the
consensus sequences have strong effects on the activity of a
given promoter, and emphasizes the fact that a more
subtle approach is needed in order to generate a promoter
library that covers a range of activities in small steps.
Clearly, had we allowed only changes in the
consensus sequences and/or changes in the length of the
spacer, instead of allowing the sequence of the spacers
to vary randomly, only fairly weak promoter clones would
have resulted, and the resulting library would not be
suitable for the present purpose.

Most often, for metabolic engineering purposes,
relatively strong promoters are desired, but there may also
be cases where rather weak promoters are needed. The
relatively few errors that had occurred during synthesis
of the above oligonucleotide mixture were not intended
to be present in the promoter fragments; but our data
suggest that it may sometimes be useful to generate,
deliberately, a mixture of oligonucleotides so that
(statistically) each oligonucleotide has a few errors in the
consensus sequences. If only weak promoters are needed,
it may also be useful to use an oligonucleotide mixture
in which the length of the spacer is shorter than what is
considered to be optimal or in which parts of the
consensus sequences have been changed deliberately.



Enzymes used in the various enzymatic reactions above
were obtained from and used as recommended by Pharmacia
and Boehringer.

EXAMPLE 2

Design of a degenerated oligonucleotide for a library of 
temperature regulated L. lactis promoters. 

This example illustrates the development of a temperature
regulated promoter library for L. lactis. Heat shock
regulated promoters in Gram positive bacteria seem to have a
common DNA sequence (located a few base pairs upstream of
the -35 sequence) which is thought to be involved in the
observed temperature regulation of the expression from
such promoters. The minimal extent of such a regulatory
element seems to be 27 base pairs:

5'-TTAGCACTCNNNNNNNNNGAGTGCTAA-3'

      IR      spacer     IR

containing a 9 bp (or longer) inverted repeat (IR) sepa- 
rated by 9 (or fewer) basepairs. It should therefore be 
possible to combine this inverted repeat with the ap- 
proach for obtaining constitutive promoters of different 
strength and thus obtain a series of promoters with vari- 
ous basal activities which can be induced several fold by 
changing the temperature of the culture medium. 
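
That the two arms of this element really form an inverted repeat
can be checked directly: the reverse complement of the left arm
must equal the right arm. A small sketch, with hypothetical
function names, using the 27 bp element given above:

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def split_element(element, arm=9):
    """Split a regulatory element into left arm, spacer and right arm."""
    return element[:arm], element[arm:-arm], element[-arm:]


def is_inverted_repeat(element, arm=9):
    """True if the right arm is the reverse complement of the left arm."""
    left, _, right = split_element(element, arm)
    return reverse_complement(left) == right


if __name__ == "__main__":
    element = "TTAGCACTC" + "N" * 9 + "GAGTGCTAA"
    print(len(element))                 # 27
    print(split_element(element))
    print(is_inverted_repeat(element))  # True
```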

Therefore, an oligonucleotide was designed, which
includes the core part (from position -35 to +6) of the
sequence from the constitutive promoter series above (see
example 1 and SEQ ID No. 1). The sequence upstream of the
-35 hexamer has been replaced by the above inverted
repeat sequence, and the sequence downstream of position +6
has also been modified, eliminating two conserved regions
compared to example 1 (+1 to +8: GTACTGTT and -44 to -41:
AGTT, which have been implicated in growth rate
regulation). The sequence of the spacer (sp. 1) between the two
inverted DNA sequences in the inverted repeat was here
allowed to vary randomly in order to see whether this had
any effect on the temperature regulation of the resulting
promoters, e.g. how many fold they could be induced by
changing the temperature. The importance of the spacing
(sp. 2) between the inverted repeat and the -35 hexamer is
not known, but in principle this region may contribute to
or modulate the heat shock response of promoters. In order
to limit the number of parameters, however, we have
chosen here to include a naturally occurring configuration
(the dnaJ promoter from L. lactis; van Asseldonk et al.,
1993): a short spacer sequence consisting of 5 times T.

When these sequences are combined, one arrives at the
following 73 bp "consensus" sequence for a temperature
regulated promoter in L. lactis. Out of these 73 bp, 45
are conserved, two are semi-conserved (R and W) and 26
are allowed to vary randomly between the four
nucleobases.

5' TTAGCACTCNNNNNNNNNGAGTGCTAATTTTT TTGACA NNNNNNNNNNNNNNTGR
      IR      sp. 1      IR   sp. 2

   TATAAT ANNWNAGTACTG 3'
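
The composition stated for this sequence can be verified by
counting. The sketch below rebuilds the 73 bp sequence from its
named blocks and tallies conserved, semi-conserved and random
positions; the block labels and the function name composition
are chosen here for illustration.

```python
# Blocks of the temperature regulated consensus sequence, 5' -> 3'.
BLOCKS = [
    ("IR", "TTAGCACTC"),
    ("sp. 1", "N" * 9),
    ("IR", "GAGTGCTAA"),
    ("sp. 2", "TTTTT"),
    ("-35", "TTGACA"),
    ("spacer", "N" * 14),
    ("-15/-13", "TGR"),
    ("-10", "TATAAT"),
    ("-6 to -1", "ANNWNA"),
    ("+1 to +6", "GTACTG"),
]

SEQUENCE = "".join(block for _, block in BLOCKS)


def composition(seq):
    """Count conserved (A/C/G/T), semi-conserved (R/W) and random (N) positions."""
    conserved = sum(base in "ACGT" for base in seq)
    semi = sum(base in "RW" for base in seq)
    rand = seq.count("N")
    return conserved, semi, rand


if __name__ == "__main__":
    print(len(SEQUENCE))          # 73
    print(composition(SEQUENCE))  # (45, 2, 26)
```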



In addition, this degenerated sequence was flanked by
sequences that specify multiple recognition sites for
restriction endonucleases (multiple cloning sites, MCS), and
the actual oligonucleotide mixture that is being
synthesized has the following degenerated sequence reported in
SEQ ID No. 2:

1                                                            60
5' CGGGATCCAAGCTTAATATTAATTAGCACTCNNNNNNNNNGAGTGCTAATTTTTTTGACA
                            IR                IR         -35

61                                                  113
   NNNNNNNNNNNNNNTGRTATAATANNWNAGTACTGCAGCTGTCTAGAATTCGG 3'
                 -15 -10        +1

Restriction sites marked in the 5' flank: MboI, DpnI, AlwI,
NlaIV, BstYI, BamHI, MseI, AluI, HindIII, SspI and AseI.
Restriction sites marked in the 3' flank: AluI, PvuII, NspBII,
SfcI, PstI, RsaI, ScaI, Fnu4HI, BbvI, MaeI, XbaI, ApoI and EcoRI.

This oligonucleotide mixture was converted into double
stranded DNA fragments (DSDF) and cloned into the promoter
cloning vector pAK80 as described in example 1, except
that the DSDF mixture and the vector pAK80 were both
digested with HindIII and PstI.



EXAMPLE 3 


Design of a degenerated oligonucleotide for a Saccharomyces
cerevisiae promoter library.

The regulation of transcription initiation in the
eukaryotic cell is somewhat more complex compared to the
prokaryote. The transcription start site is normally
preceded by a so-called TATA box that contains the consensus
sequence TATAAA or parts hereof, but unlike in the
prokaryote, the distance from the TATA box to the
transcription start site is much less defined. In Saccharomyces
cerevisiae this distance is typically 40-120 nucleotides
(Oliver and Warmington, 1989). The so-called -35
consensus hexamer which is found in many prokaryotic promoters
is absent in Saccharomyces cerevisiae and instead
so-called upstream activation sequences (UAS) are found some
50-500 bp upstream of the transcription initiation site.
These UAS are recognised by specific DNA binding proteins
that can then act as activators of transcription
initiation. For instance, the UAS sequence that is found
upstream of the genes involved in amino acid biosynthesis,
UASGCN4p, consists of a DNA sequence that specifies a
binding site for the GCN4 protein, which activates the
transcription of these genes (Hinnebusch, 1986). Some
genes contain more than one copy of the UAS, but one
seems to be sufficient for activation. The consensus
sequence for the GCN4 protein binding site is a short
inverted repeat, TGACTCA.

A promoter in Saccharomyces cerevisiae that is activated
by the GCN4 protein is the ARG8 promoter. In this
promoter, there is only one copy of the UASGCN4p sequence,
and it is located 59 bp from the TATA box (we refer to
this distance as spacer 1). Transcription initiation
takes place within a 23 bp sequence (TI box, underlined)
which is located 38 bp downstream of the TATA box. This
23 bp sequence also functions as the binding site for the
arginine repressor, argR (Crabeel et al., 1995). Thus, in
this case the promoter is located within 136 bp, which
makes the system attractive for developing a promoter
library with the degenerated oligonucleotide approach
described in examples 1 and 2.

A 199 bp oligonucleotide was designed, which includes,
starting from the 5' end: an EcoRI restriction site (for
use in the subsequent cloning strategy, see below), the
consensus UAS_GCN4p, a 60 bp spacer (spacer 1), the
consensus TATAAA sequence (TATA box), a 39 bp spacer (spacer
2), and the 23 bp repressor binding sequence and TI box. The
spacer sequence between the TI box and the first codon of
the ARG8 gene might also contribute to the strength of
the resulting promoters. However, in order to limit the
number of parameters, we have chosen here to include the
native sequence from the ARG8 gene for this spacer,
including also the first codon of the ARG8 gene, ATG.
Finally, a BamHI site was included, which is designed to
give an in-frame fusion with the beta-galactosidase
reporter gene, see below.
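The composition described above can be checked with a short script. The following is only a sketch and not part of the original disclosure: the part names and the helper `feature_span` are hypothetical, the two bases in front of the EcoRI site and the bases after the start codon are taken from SEQ ID No. 3, and N stands for a randomised position.

```python
# Illustrative assembly of the 199 bp degenerate oligonucleotide from its parts.
# Part names are our own; the fixed sequences are copied from SEQ ID No. 3.
PARTS = [
    ("5' flank", "CA"),
    ("EcoRI site", "GAATTC"),
    ("linker", "G"),
    ("UAS_GCN4p", "TGACTCA"),
    ("spacer 1", "N" * 60),          # randomised
    ("TATA box", "TATAAA"),
    ("spacer 2", "N" * 39),          # randomised
    ("TI box / argR site", "CTCTTAAGT" + "GCAAGTGACT" + "GCGA"),
    ("native ARG8 spacer",
     "ACATTT" + "TTTTCGTTTG" + "TTAGAATAAT" + "TCAAGAATCG" + "CTACCAATC"),
    ("ARG8 start codon", "ATG"),
    ("BamHI remainder and 3' flank", "GATCCCG"),
]

oligo = "".join(seq for _, seq in PARTS)


def feature_span(name):
    """Return the 1-based inclusive (start, end) position of a named part."""
    pos = 0
    for part, seq in PARTS:
        if part == name:
            return pos + 1, pos + len(seq)
        pos += len(seq)
    raise KeyError(name)


print(len(oligo))                           # 199
print(feature_span("UAS_GCN4p"))            # (10, 16)
print(feature_span("TI box / argR site"))   # (122, 144)
```

Under these assumptions the parts add up to 199 bases, and the BamHI site GGATCC shares its first G with the ATG codon.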



The sequences of spacer 1 and spacer 2 were allowed to
vary randomly, and the actual oligonucleotide mixture
that was synthesized has the following degenerated
sequence, reported in SEQ ID No. 3:

   Sites: EcoRI, ApoI, PleI, HinfI, MaeIII
5' CAGAATTCGT GACTCANNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN
            UAS_GCN4p

   NNNNNNNNNN NNNNNNTATA AANNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN
                    TATA box

   Sites: MseI, AflII, MaeIII, TfiI, HinfI
   NCTCTTAAGT GCAAGTGACT GCGAACATTT TTTTCGTTTG TTAGAATAAT TCAAGAATCG
    TI box

   Sites: MboI, DpnI, AlwI, NlaIV, BstYI, BamHI
   CTACCAATCA TGGATCCCG 3'
            ARG8
This oligonucleotide mixture was converted into double
stranded DNA fragments (DSDF), using an oligonucleotide
complementary to the last 23 bp of the 3' end of the 199
bp degenerated oligonucleotide as described in example 1.
Subsequently, it was cloned into either of the two promoter
cloning vectors, pYLZ-2 and pYLZ-6 (Hermann et al.,
1992), as follows: the DSDF mixture and the vector were
both double-digested with EcoRI and BamHI, and the DSDF
were ligated to the vector DNA. The ligation mixture was
transformed into E. coli as described in example 1, with
selection for ampicillin resistance, and a large number
of clones were obtained and pooled into a library of
promoter clones. Plasmid DNA was isolated from this pool and
transformed into S. cerevisiae, with selection for the
URA3 marker carried by the vector. In addition to the
ura3 mutation, the recipient yeast strain used has two
mutations: one that gives constitutive expression of GCN4
and one that inactivates the repressor, argR. (In this
strain background the promoters resulting from the 199 bp
oligonucleotide should be constitutive, which facilitates
the initial screening for promoters of different
strength, but if the plasmids with promoters of different
strength are transformed into a wildtype strain of S.
cerevisiae, the promoters should be regulated by the GCN4
regulon and by arginine.)

The clones were screened for beta-galactosidase activity
as described (Guarente, 1983).

Some yeast transformants were selected for further studies,
and plasmids were rescued and characterized for the
promoter structures.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Peter Ruhdal Jensen 

(B) STREET: Soegaardsvej 19 

(C) CITY: Gentofte 

(E) COUNTRY: Denmark 

(F) POSTAL CODE (ZIP) : DK-2820 

(ii) TITLE OF INVENTION: Artificial promoter libraries for 
microorganisms 

(iii) NUMBER OF SEQUENCES: 3 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Lactococcus lactis 

(ix) FEATURE: 

(A) NAME/KEY: promoter 

(B) LOCATION: 26..82

(C) IDENTIFICATION METHOD: experimental

(D) OTHER INFORMATION: /evidence= EXPERIMENTAL

/standard_name= "Artificial promoter library"
/note= "A degenerated sequence specifying a mixture of
artificial promoters covering a wide range of expression
in small steps in L. lactis"

(ix) FEATURE: 

(A) NAME/KEY: misc_feature

(B) LOCATION: 31..45

(D) OTHER INFORMATION: /standard_name= "Consensus sequence"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 60..69

(D) OTHER INFORMATION: /standard_name= "Consensus sequence"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 74..82

(D) OTHER INFORMATION: /standard_name= "Consensus sequence"

(ix) FEATURE:

(A) NAME/KEY: -35_signal

(B) LOCATION: 40..45

(D) OTHER INFORMATION: /standard_name= "-35 box"

(ix) FEATURE:

(A) NAME/KEY: -10_signal

(B) LOCATION: 63..68

(D) OTHER INFORMATION: /standard_name= "Pribnow box"

(ix) FEATURE: 

(A) NAME/KEY: misc_recomb

(B) LOCATION: 3..25

(C) IDENTIFICATION METHOD: experimental

(D) OTHER INFORMATION: /evidence= EXPERIMENTAL

/standard_name= "Multiple cloning site"
/label= MCS

/note= "A sequence specifying recognition sites for the
restriction endonucleases: NlaIV, BstYI, BamHI, AlwI,
MboI, DpnI, AflII, MseI, SspI, NsiI."

(ix) FEATURE:

(A) NAME/KEY: misc_recomb

(B) LOCATION: 74..98

(C) IDENTIFICATION METHOD: experimental

(D) OTHER INFORMATION: /evidence= EXPERIMENTAL

/standard_name= "Multiple cloning site"
/label= MCS

/note= "A sequence specifying recognition sites for the
restriction endonucleases: ScaI, RsaI, HpaI, HincII,
MseI, SfcI, PstI, Fnu4HI, BbvI, PvuII, NspBII, AluI,
EcoRI."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
CGGGATCCTT AAGAATATTA TGCATNNNNN AGTTTATTCT TGACANNNNN NNNNNNNNNT 60
GGTATAATAN NANAGTACTG TTAACTGCAG CTGAATTCGG 100
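The feature positions given for SEQ ID NO: 1 can be cross-checked against the sequence itself. The sketch below is illustrative only: it reads position 50 as N (an assumption about the printed listing), and it uses a regular expression of our own that allows any spacer of 14-23 bases between the -35 and -10 signals.

```python
import re

# SEQ ID NO: 1 in blocks of ten; position 50 is read as N (assumption).
SEQ_ID_1 = (
    "CGGGATCCTT" + "AAGAATATTA" + "TGCATNNNNN" + "AGTTTATTCT" + "TGACANNNNN"
    + "NNNNNNNNNT" + "GGTATAATAN" + "NANAGTACTG" + "TTAACTGCAG" + "CTGAATTCGG"
)

# -35 signal, a spacer of 14-23 arbitrary symbols, then the -10 signal.
match = re.search(r"TTGACA(?P<spacer>.{14,23}?)TATAAT", SEQ_ID_1)

# Convert 0-based regex offsets to 1-based inclusive positions.
minus35 = (match.start() + 1, match.start() + 6)
minus10 = (match.end() - 5, match.end())
spacer_length = len(match.group("spacer"))

print(minus35)        # (40, 45)
print(minus10)        # (63, 68)
print(spacer_length)  # 17
```

The positions found this way agree with the -35_signal and -10_signal features listed above, and the spacer comes out at 17 bp.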



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: YES 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: promoter 

(B) LOCATION: 23..95 

(D) OTHER INFORMATION: /standard_name= "Artificial promoter
library"

/note= "A degenerated sequence specifying a mixture of
artificial temperature regulated promoters covering a
wide range of expression in small steps in L. lactis"



(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 23..49

(D) OTHER INFORMATION: /standard_name= "Sequence providing
temperature regulation to promoters"

/note= "This sequence comprising two inverted repeats
separated by a short spacer provides temperature (heat
shock) regulation to promoters in Gram-positive bacteria"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 50..60

(D) OTHER INFORMATION: /standard_name= "Consensus sequence"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 75..84

(D) OTHER INFORMATION: /standard_name= "Consensus sequence"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 89..95

(D) OTHER INFORMATION: /standard_name= "Consensus sequence"

(ix) FEATURE:

(A) NAME/KEY: -35_signal

(B) LOCATION: 55..60

(D) OTHER INFORMATION: /standard_name= "-35 box"

(ix) FEATURE:

(A) NAME/KEY: -10_signal

(B) LOCATION: 78..83

(D) OTHER INFORMATION: /standard_name= "Pribnow box"

(ix) FEATURE: 

(A) NAME/KEY: misc_recomb

(B) LOCATION: 3..22

(D) OTHER INFORMATION: /standard_name= "Multiple cloning site"
/label= MCS

/note= "A sequence specifying recognition sites for the
restriction endonucleases: NlaIV, BstYI, BamHI, AlwI,
MboI, DpnI, HindIII, AluI, MseI (2 sites), SspI, AseI"

(ix) FEATURE:

(A) NAME/KEY: misc_recomb
(B) LOCATION: [illegible]..111

(D) OTHER INFORMATION: /standard_name= "Multiple cloning site"
/label= MCS

/note= "A sequence specifying recognition sites for the
restriction endonucleases: ScaI, RsaI, SfcI, PstI,
Fnu4HI, BbvI, PvuII, NspBII, AluI, XbaI, MaeI, EcoRI,
ApoI"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CGGGATCCAA GCTTAATATT AATTAGCACT CNNNNNNNNN GAGTGCTAAT TTTTTTGACA 60
NNNNNNNNNN NNNNTGGTAT AATANNANAG TACTGCAGCT GTCTAGAATT CGG 113
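The temperature regulation element at positions 23..49 is described as two inverted repeats separated by a short spacer. A minimal sketch of how this can be verified is given below; it assumes that the unreadable symbol at position 26 of the printed listing is G, and the helper `reverse_complement` is our own.

```python
# SEQ ID NO: 2 in blocks of ten; position 26 is read as G (assumption).
SEQ_ID_2 = (
    "CGGGATCCAA" + "GCTTAATATT" + "AATTAGCACT" + "CNNNNNNNNN" + "GAGTGCTAAT"
    + "TTTTTTGACA" + "NNNNNNNNNN" + "NNNNTGGTAT" + "AATANNANAG" + "TACTGCAGCT"
    + "GTCTAGAATT" + "CGG"
)

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


# Positions 23..49 (1-based, inclusive) hold the regulatory element.
element = SEQ_ID_2[22:49]
left, spacer, right = element[:9], element[9:18], element[18:]

print(left, spacer, right)
print(right == reverse_complement(left))  # True
```

With this reading, the right-hand repeat is the exact reverse complement of the left-hand repeat, and the nine positions between them are randomised.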



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Saccharomyces cerevisiae

(ix) FEATURE: 

(A) NAME/KEY: protein_bind

(B) LOCATION: 10..16

(D) OTHER INFORMATION: /function= "Activating promoters in
S. cerevisiae"

/bound_moiety= "GCN4 protein"

/standard_name= "Upstream activating sequence"
/label= UAS_GCN4p

/note= "A DNA sequence that specifies a binding site for
the GCN4 protein, which activates the transcription of
genes involved in amino acid synthesis in S. cerevisiae"

(ix) FEATURE: 

(A) NAME/KEY: TATA_signal 

(B) LOCATION: 67..72

(D) OTHER INFORMATION: /standard_name= "TATA box"

(ix) FEATURE:

(A) NAME/KEY: misc_signal

(B) LOCATION: 122..144

(D) OTHER INFORMATION: /function= "Transcription initiation"
/standard_name= "TI box"

(ix) FEATURE: 

(A) NAME/KEY: protein_bind

(B) LOCATION: [illegible]

(D) OTHER INFORMATION: /bound_moiety= "Arginine repressor"
/standard_name= "arginine repressor binding site"
/label= argR

(ix) FEATURE:

(A) NAME/KEY: misc_RNA

(B) LOCATION: [illegible]..197

(D) OTHER INFORMATION: /function= "Spacer"

/standard_name= "Part of native sequence for ARG8
gene incl. first codon"

(ix) FEATURE:

(A) NAME/KEY: misc_recomb
(B) LOCATION: [illegible]

(D) OTHER INFORMATION: /standard_name= "Recognition site
for restriction endonuclease EcoRI"

/label= EcoRI_site

(ix) FEATURE:

(A) NAME/KEY: misc_recomb

(B) LOCATION: [illegible]..197

(D) OTHER INFORMATION: /standard_name= "Recognition site
for restriction endonuclease BamHI"
/label= BamHI_site

(ix) FEATURE: 

(A) NAME/KEY: promoter

(B) LOCATION: 10..192

(D) OTHER INFORMATION: /standard_name= "Artificial promoter
library"

/note= "A degenerated sequence specifying a mixture of
artificial promoters covering a wide range of expression
in small steps in S. cerevisiae"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CAGAATTCGT GACTCACTTA AAGTACATGA TCCGTCATTG TGCACTTTTT TACGAAATAA 60

TGGCCTTTTT CCGCTCTATT TAAAAGCGTG AAAAAAAAAT TGCAACATGA AGAAAATTCG 120 

ACTCTTAAGT GCAAGTGACT GCGAACATTT TTTTCGTTTG TTAGAATAAT TCAAGAATCG 180 

CTACCAATCA TGGATCCCG 199 
PATENT CLAIMS

1. An artificial promoter library for a selected micro- 
organism or group of microorganisms, comprising a mixture 
of double stranded DNA fragments the sense strands of 
which comprise at least one consensus sequence or part(s) 
thereof of efficient promoters from said microorganism or 
group of microorganisms and surrounding or intermediate 
nucleotide sequences (spacers) of variable length in 
which the nucleotides are selected randomly among the nu- 
cleobases A, T, C and G. 

2. An artificial promoter library according to claim 1, 
wherein the sense strands of the double stranded DNA 
fragments include a regulatory DNA sequence imparting a 
specific regulatory feature to the promoters of the
library.

3. An artificial promoter library according to claim 2, 
wherein the specific regulatory feature is activation by 
a change in the growth conditions. 

4. An artificial promoter library according to any one 
of claims 1-3, wherein the double stranded DNA fragments 
have sequences comprising one or more recognition sites 
for restriction endonucleases added to their ends.

5. An artificial promoter library according to claim 4, 
wherein the double stranded DNA fragments have sequences 
specifying multiple recognition sites for restriction
endonucleases (multiple cloning sites, MCS) added to their
ends.

6. An artificial promoter library according to any one
of claims 1-5, wherein the selected microorganism or group
of microorganisms is selected from the group consisting
of prokaryotes.

7. An artificial promoter library according to claim 6,
wherein the selected microorganism or group of microorganisms
is selected from the group consisting of lactic acid
bacteria of the genus Lactococcus.

8. An artificial promoter library according to claim 7,
wherein the selected microorganism is a strain of the
species Lactococcus lactis.

9. An artificial promoter library according to claim 7
or 8, wherein said consensus sequences comprise the -35
signal (-35 to -30): TTGACA or the -10 signal (-12 to
-7): TATAAT or a part of one of them comprising at least 4
conserved nucleotides.

10. An artificial promoter library according to claim 7
or 8, wherein said consensus sequences comprise the -35
signal (-35 to -30): TTGACA and the -10 signal (-12 to
-7): TATAAT or parts of both comprising together at least
6 conserved nucleotides.

11. An artificial promoter library according to claim 9
or 10, wherein said consensus sequences further comprise
intervening conserved motifs.

12. An artificial promoter library according to claim
11, wherein said intervening conserved motifs are selected
from the group consisting of the conserved motifs
-44 to -41: AGTT, -40 to -36: TATTC and +1 to +8:
GTACTGTT.

13. An artificial promoter library according to any one
of claims 10-12, wherein the length of the spacer between
the -35 signal and the -10 signal is 14-23 bp.

14. An artificial promoter library according to claim
13, wherein the length of the spacer between the -35 signal
and the -10 signal is 16-18 bp.

15. An artificial promoter library according to claim 
13, wherein the length of the spacer between the -35 sig- 
nal and the -10 signal is 17 bp. 

16. An artificial promoter library according to any one 
of claims 12-15, wherein the sense strands of the double 
stranded DNA fragments have the degenerated sequence 
shown in SEQ ID No. 1 with minor variations in the
consensus sequences and spacer lengths.

17. An artificial promoter library according to claim 
10, wherein the sense strands of the double stranded DNA 
fragments, including an upstream temperature regulatory 
DNA sequence, have the degenerated sequence shown in SEQ 
ID No. 2 with minor variations in the consensus sequences 
and spacer lengths. 

18. An artificial promoter library according to any one
of claims 1-5, wherein the selected microorganism or group
of microorganisms is selected from the group consisting
of eukaryotic microorganisms.

19. An artificial promoter library according to claim
18, wherein the selected microorganism or group of
microorganisms is selected from the group consisting of
yeasts, other fungi and mammalian cell lines.

20. An artificial promoter library according to claim
19, wherein the selected microorganism is a strain of the
yeast species Saccharomyces cerevisiae.

21. An artificial promoter library according to claim
20, wherein said consensus sequences comprise a
transcription initiation signal (TI box) functioning in
Saccharomyces cerevisiae.

22. An artificial promoter library according to claim
21, wherein said consensus sequences further comprise a
TATA box and/or at least one upstream activation sequence
(UAS) upstream of the TI box.

23. An artificial promoter library according to claim
22, wherein said consensus sequences comprise the TI box:
CTCTTAAGTGCAAGTGACTGCGA, which also functions as the
binding site for the arginine repressor, argR, the TATA
box: TATAAA, and the UAS_GCN4p: TGACTCA.

24. An artificial promoter library according to claim 23,
wherein the sense strands of the double stranded DNA
fragments have the degenerated sequence shown in SEQ ID
No. 3 with minor variations in the consensus sequences
and spacer lengths.

25. A method of constructing an artificial promoter
library for a selected microorganism or group of
microorganisms, which comprises selecting at least one consensus
sequence of efficient promoters from said microorganism
or group of microorganisms; synthesizing a mixture of
single stranded DNA sequences comprising said consensus
sequence(s) or part(s) thereof and surrounding or
intermediate nucleotide sequences (spacers) of variable length
in which the nucleotides are selected randomly among the
nucleobases A, T, C and G; and converting the single
stranded DNA sequences into double stranded DNA fragments
by second strand synthesis.

26. A method according to claim 25, wherein the single
stranded DNA sequences include a regulatory DNA sequence
imparting a specific regulatory feature to the promoters
of the library.

27. A method according to claim 26, wherein the specific
regulatory feature is activation by a change in the
growth conditions.

28. A method according to any one of claims 25-27,
wherein sequences specifying one or more recognition
sites for restriction endonucleases are added to the ends
of the single stranded DNA sequences in the synthesis, or
linkers comprising such restriction sites are ligated to
the ends of the double stranded DNA fragments.

29. A method according to claim 28, wherein sequences
specifying multiple recognition sites for restriction
endonucleases are added to the ends of the single stranded
DNA sequences in the synthesis, or linkers comprising
such multiple cloning sites (MCS) are ligated to the ends
of the double stranded DNA fragments.

30. A method according to any one of claims 25-29,
wherein the selected microorganism or group of microorganisms
is selected from the group consisting of prokaryotes.

31. A method according to claim 30, wherein the selected
microorganism or group of microorganisms is selected from
the group consisting of lactic acid bacteria of the genus
Lactococcus.

32. A method according to claim 31, wherein the selected
microorganism is a strain of the species Lactococcus
lactis.

33. A method according to claim 31 or 32, wherein said
consensus sequences comprise the -35 signal (-35 to -30):
TTGACA or the -10 signal (-12 to -7): TATAAT or a part of
one of them comprising at least 4 conserved nucleotides.

34. A method according to claim 31 or 32, wherein said
consensus sequences comprise the -35 signal (-35 to -30):
TTGACA and the -10 signal (-12 to -7): TATAAT or parts of
both comprising together at least 6 conserved
nucleotides.



35. A method according to claim 33 or 34, wherein said
consensus sequences further comprise intervening
conserved motifs.

36. A method according to claim 35, wherein said
intervening conserved motifs are selected from the group
consisting of the conserved motifs -44 to -41: AGTT, -40 to
-36: TATTC and +1 to +8: GTACTGTT.

37. A method according to any one of claims 34-36,
wherein the length of the spacer between the -35 signal
and the -10 signal is 14-23 bp.

38. A method according to claim 37, wherein the length
of the spacer between the -35 signal and the -10 signal
is 16-18 bp.

39. A method according to claim 37, wherein the length
of the spacer between the -35 signal and the -10 signal
is 17 bp.

40. A method according to any one of claims 34-39,
wherein the mixture of single stranded DNA sequences has
the degenerated sequence shown in SEQ ID No. 1 with minor
variations in the consensus sequences and spacer lengths.

41. A method according to any one of claims 34, 35 and
37-39, wherein the mixture of single stranded DNA
sequences including an upstream temperature regulatory DNA
sequence has the degenerated sequence shown in SEQ ID No.
2 with minor variations in the consensus sequences and
spacer lengths.

42. A method according to any one of claims 25-29,
wherein the selected microorganism or group of microorganisms
is selected from the group consisting of eukaryotic
microorganisms.

43. A method according to claim 42, wherein the selected
microorganism or group of microorganisms is selected from
the group consisting of yeasts, other fungi and mammalian
cell lines.

44. A method according to claim 43, wherein the selected
microorganism is a strain of the yeast species
Saccharomyces cerevisiae.

45. A method according to claim 44, wherein said
consensus sequences comprise a transcription initiation signal
(TI box) functioning in Saccharomyces cerevisiae.

46. A method according to claim 45, wherein said
consensus sequences further comprise a TATA box and/or at least
one upstream activation sequence (UAS) upstream of the TI
box.

47. An artificial promoter library according to claim
46, wherein said consensus sequences comprise the TI box:
CTCTTAAGTGCAAGTGACTGCGA, which also functions as the
binding site for the arginine repressor, argR, the TATA
box: TATAAA, and the UAS_GCN4p: TGACTCA.

48. An artificial promoter library according to claim 47,
wherein the sense strands of the double stranded DNA
fragments have the degenerated sequence shown in SEQ ID
No. 3 with minor variations in the consensus sequences
and spacer lengths.



49. A method of optimizing the expression of a recombinant
heterologous gene in a selected microorganism, which
comprises transforming or transfecting the microorganism
with a set of vectors each including said heterologous
gene under the control of at least one member of an
artificial promoter library according to any one of claims
1-24 or constructed by the method according to any one of
claims 25-48, said set of vectors covering a wide range
of promoter activities in relatively small steps, growing
the selected clones and screening them to find the one
showing maximum metabolic flux of the product encoded by
said heterologous gene.
In the above-mentioned application, which was filed on 23 August 1996, we have become
aware of some obvious errors, which we hereby wish to correct:

In the SEQUENCE LISTING, (2) INFORMATION FOR SEQ ID NO: 3, (xi) SEQUENCE
DESCRIPTION: SEQ ID NO: 3 on page 27, a starting oligonucleotide for the degeneration
of the sequence given in Example 3 had been entered by mistake. Since the degenerated
sequence that is to be SEQ ID No. 3 is clearly given in Example 3 on page 19, we have now
entered this correct SEQ ID No. 3 in the SEQUENCE LISTING on page 27.

In addition, claims 47 and 48, which were copied from claims 23 and 24, had by mistake not
been changed to "A method". That they should have been is, however, clear from the fact that
they refer to claims 46 and 47 respectively, and we have therefore now corrected "An
artificial promoter library" to "A method" in claims 47 and 48.

We enclose a new page 27 with the correct SEQ ID No. 3 and a new page 34 in which claims 47
and 48 are corrected as explained above.

Also enclosed is a diskette with the sequence listing according to the PatentIn system as a
text file: 2960322.seq.
Finally, the missing declaration of assignment is enclosed.

Yours sincerely,

Hofman-Bang & Boutard, Lehmann & Ree A/S

Enclosures

New page 27 in duplicate
New page 34 in duplicate
Diskette

Declaration of assignment




27 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CAGAATTCGT GACTCANNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 60 

NNNNNNNNNN NNNNNNTATA AANNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 120 

NCTCTTAAGT GCAAGTGACT GCGAACATTT TTTTCGTTTG TTAGAATAAT TCAAGAATCG 180 

CTACCAATCA TGGATCCCG 199 
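To illustrate what the degenerated sequence stands for, the sketch below draws individual members of the mixture by replacing each N with a randomly chosen base. It is not part of the original disclosure: the function `sample_member` is hypothetical, a fixed seed is used only to make the run repeatable, and an idealised synthesis with equal probability for A, C, G and T at each N is assumed.

```python
import random

# Corrected SEQ ID NO: 3, copied block by block from the listing above.
LINE1 = "CAGAATTCGT" + "GACTCANNNN" + "NNNNNNNNNN" * 4
LINE2 = "NNNNNNNNNN" + "NNNNNNTATA" + "AANNNNNNNN" + "NNNNNNNNNN" * 3
LINE3 = ("NCTCTTAAGT" + "GCAAGTGACT" + "GCGAACATTT"
         + "TTTTCGTTTG" + "TTAGAATAAT" + "TCAAGAATCG")
LINE4 = "CTACCAATCA" + "TGGATCCCG"
TEMPLATE = LINE1 + LINE2 + LINE3 + LINE4


def sample_member(template, rng):
    """Replace every N by a random base; fixed positions are kept as they are."""
    return "".join(rng.choice("ACGT") if base == "N" else base
                   for base in template)


rng = random.Random(0)
members = [sample_member(TEMPLATE, rng) for _ in range(3)]

n_random = TEMPLATE.count("N")
print(len(TEMPLATE), n_random)   # 199 99
for member in members:
    print(member)
```

Every sampled member keeps the EcoRI site, the UAS_GCN4p, the TATA box, the TI box and the native ARG8 sequence unchanged, while the 99 spacer positions differ from member to member.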
2 with minor variations in the consensus sequences and
spacer lengths.

42. A method according to any one of claims 25-29,
wherein the selected microorganism or group of microorganisms
is selected from the group consisting of eukaryotic
microorganisms.

43. A method according to claim 42, wherein the selected
microorganism or group of microorganisms is selected from
the group consisting of yeasts, other fungi and mammalian
cell lines.

44. A method according to claim 43, wherein the selected
microorganism is a strain of the yeast species
Saccharomyces cerevisiae.

45. A method according to claim 44, wherein said
consensus sequences comprise a transcription initiation signal
(TI box) functioning in Saccharomyces cerevisiae.

46. A method according to claim 45, wherein said
consensus sequences further comprise a TATA box and/or at least
one upstream activation sequence (UAS) upstream of the TI
box.

47. A method according to claim 46, wherein said
consensus sequences comprise the TI box:
CTCTTAAGTGCAAGTGACTGCGA, which also functions as the
binding site for the arginine repressor, argR, the TATA
box: TATAAA, and the UAS_GCN4p: TGACTCA.

48. A method according to claim 47, wherein the sense
strands of the double stranded DNA fragments have the
degenerated sequence shown in SEQ ID No. 3 with minor
variations in the consensus sequences and spacer lengths.



